Text multi-label classification method incorporating BERT and label semantic attention
Xueqiang LYU, Chen PENG, Le ZHANG, Zhi’an DONG, Xindong YOU
Journal of Computer Applications    2022, 42 (1): 57-63.   DOI: 10.11772/j.issn.1001-9081.2021020366

Multi-Label Text Classification (MLTC) is one of the important subtasks in the field of Natural Language Processing (NLP). To address the complex correlations among multiple labels, an MLTC method called TLA-BERT was proposed, incorporating Bidirectional Encoder Representations from Transformers (BERT) and label semantic attention. Firstly, the contextual vector representation of the input text was learned by fine-tuning the autoencoding pre-trained model. Secondly, the labels were individually encoded using a Long Short-Term Memory (LSTM) network. Finally, an attention mechanism was used to explicitly highlight the contribution of the text to each label in order to predict the multi-label sequence. Experimental results show that, compared with the Sequence Generation Model (SGM) algorithm, the proposed method improves the F1 score by 2.8 and 1.5 percentage points on the Arxiv Academic Paper Dataset (AAPD) and the Reuters Corpus Volume I (RCV1)-v2 public datasets, respectively.
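As a rough sketch of the label semantic attention step described in the abstract (the shapes, softmax pooling, and all names here are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def label_attention(text_states, label_embeddings):
    """Pool token representations into one vector per label.

    text_states: (seq_len, d) token vectors (e.g. from fine-tuned BERT).
    label_embeddings: (num_labels, d) label encodings (e.g. from an LSTM).
    Returns (num_labels, d): an attention-weighted text summary per label.
    """
    scores = label_embeddings @ text_states.T                 # (num_labels, seq_len)
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights = scores / scores.sum(axis=1, keepdims=True)      # softmax over tokens
    return weights @ text_states

rng = np.random.default_rng(0)
H = rng.normal(size=(16, 8))   # 16 tokens, hidden size 8
E = rng.normal(size=(4, 8))    # 4 candidate labels
per_label = label_attention(H, E)
print(per_label.shape)  # (4, 8)
```

A per-label score (e.g. a shared linear layer followed by a sigmoid) would then turn each pooled vector into a label prediction.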

Table and Figures | Reference | Related Articles | Metrics
Classification of steel surface defects based on lightweight network
SHI Yangxiao, ZHANG Jun, CHEN Peng, WANG Bing
Journal of Computer Applications    2021, 41 (6): 1836-1841.   DOI: 10.11772/j.issn.1001-9081.2020081244
Defect classification is an important part of steel surface defect detection. Although Convolutional Neural Networks (CNN) have achieved good results, their growing number of parameters incurs a large computational cost, which poses great challenges for deploying defect classification tasks on personal computers or low-compute devices. To address this problem, a novel lightweight network model named Mix-Fusion was proposed. Firstly, group convolution and channel shuffle were used to reduce the computational cost while maintaining accuracy. Secondly, a narrow feature map was used to fuse and encode the information between the groups, and the generated features were combined with the original network, effectively solving the problem that "sparse connection" convolution hinders information exchange between groups. Finally, Mixed depthwise Convolution (MixConv) was used to replace traditional DepthWise Convolution (DWConv) to further improve model performance. Experimental results on the NEU-CLS dataset show that, in the defect classification task, the Mix-Fusion network requires 43.4 Million FLoating-point OPerations (MFLOPs) and achieves a classification accuracy of 98.61%. Compared with ShuffleNetV2 and MobileNetV2, Mix-Fusion effectively reduces the model parameters and compresses the model size, while obtaining better classification accuracy.
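The group-convolution-plus-channel-shuffle idea mentioned above can be illustrated with a minimal NumPy sketch of the shuffle operation alone (ShuffleNet-style; the array shapes are assumptions for illustration):

```python
import numpy as np

def channel_shuffle(x, groups):
    """Interleave channels across groups so that a following group
    convolution sees information from every preceding group.

    x: (batch, channels, h, w); channels must be divisible by groups.
    """
    b, c, h, w = x.shape
    assert c % groups == 0
    # reshape to (b, groups, c//groups, h, w), swap the group axes, flatten back
    return (x.reshape(b, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)
             .reshape(b, c, h, w))

x = np.arange(8).reshape(1, 8, 1, 1)   # channels labeled 0..7
y = channel_shuffle(x, groups=2)
print(y.ravel().tolist())  # [0, 4, 1, 5, 2, 6, 3, 7]
```

Channels from the two groups (0-3 and 4-7) end up interleaved, which is exactly what lets subsequent "sparse connection" group convolutions mix information across groups.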
Reference | Related Articles | Metrics
Target-dependent method for authorship attribution
Yang LI, Wei ZHANG, Chen PENG
Journal of Computer Applications    2020, 40 (2): 473-478.   DOI: 10.11772/j.issn.1001-9081.2019101768

Authorship attribution is the task of determining the author of a particular document. However, traditional methods for authorship attribution are target-independent, considering no constraint during the prediction of authorship, which is inconsistent with real-world problems. To address this issue, a Target-Dependent method for Authorship Attribution (TDAA) was proposed. Firstly, the product ID corresponding to a user review was chosen as the constraint information. Secondly, Bidirectional Encoder Representations from Transformers (BERT) was used to extract pre-trained review text features, making the text modeling process more universal. Thirdly, a Convolutional Neural Network (CNN) was used to extract deep features of the text. Finally, two fusion methods were proposed to fuse the two different kinds of information. Experimental results on the Amazon Movie_and_TV and CDs_and_Vinyl_5 datasets show that the proposed method improves accuracy by 4%-5% compared with the baseline methods.
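The abstract does not specify the two fusion methods, so the following is only a hedged sketch of two common choices (plain concatenation and a hypothetical sigmoid-gated fusion; all dimensions and the gate matrix are invented for illustration):

```python
import numpy as np

def fuse_concat(text_feat, product_feat):
    """Early fusion: simply concatenate the two feature views."""
    return np.concatenate([text_feat, product_feat])

def fuse_gated(text_feat, product_feat, W):
    """Gated fusion: the constraint (product) feature modulates the
    text feature through a sigmoid gate; W would be learned."""
    gate = 1.0 / (1.0 + np.exp(-(W @ product_feat)))   # values in (0, 1)
    return gate * text_feat

rng = np.random.default_rng(0)
text_feat = rng.normal(size=64)      # e.g. CNN-pooled BERT features
product_feat = rng.normal(size=16)   # e.g. an embedded product ID
W = rng.normal(size=(64, 16))
fused_a = fuse_concat(text_feat, product_feat)
fused_b = fuse_gated(text_feat, product_feat, W)
print(fused_a.shape, fused_b.shape)  # (80,) (64,)
```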

Table and Figures | Reference | Related Articles | Metrics
Population model of giant panda ecosystem based on population dynamics P system
TIAN Hao, ZHANG Gexiang, RONG Haina, Mario J. PÉREZ-JIMÉNEZ, Luis VALENCIA-CABRERA, CHEN Peng, HOU Rong, QI Dunwu
Journal of Computer Applications    2018, 38 (5): 1488-1493.   DOI: 10.11772/j.issn.1001-9081.2017102551
Giant panda pedigree data is an important data basis for studying the population dynamics of giant pandas, so modeling giant panda ecosystems with these data is of great significance from the perspective of panda conservation. Focusing on this issue, a data modeling method for the giant panda ecosystem based on a Population Dynamics P system was proposed. Based on the giant panda pedigree data released by the Chinese Association of Zoological Gardens, the population characteristics of captive pandas at the China Giant Panda Conservation Research Center were simulated and studied starting from individual behavior. The change rules of the reproductive parameters were analyzed in detail and added to the wild-release module. Eventually, a Population Dynamics P system for giant panda release-to-the-wild was designed, with a two-layer nested membrane structure, a collection of objects, and a series of evolution rules consistent with the characteristics of the giant panda. For the whole giant panda population, the maximum relative error between the simulation results of the P system and the actual data was within ±4.13%, and was basically controlled within ±2.7%. The experimental results verify the effectiveness and soundness of the proposed model: it can simulate the population dynamics of the giant panda and provide a basis for management decision-making.
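As a loose, much-simplified illustration of the kind of probabilistic per-individual update a Population Dynamics P system evolves (the real model's membranes, objects, and reproductive parameters are far richer; the rates below are invented):

```python
import random

def step(population, birth_rate, death_rate, rng):
    """One discrete step of a toy population update: each individual
    survives with probability 1 - death_rate, and each survivor
    reproduces with probability birth_rate."""
    survivors = sum(1 for _ in range(population) if rng.random() > death_rate)
    births = sum(1 for _ in range(survivors) if rng.random() < birth_rate)
    return survivors + births

rng = random.Random(42)
pop = 100
for _ in range(10):   # simulate ten generations
    pop = step(pop, birth_rate=0.2, death_rate=0.1, rng=rng)
print(pop)
```

A real P system applies such rules inside nested membranes, so that e.g. captive and wild-released subpopulations evolve under different rule sets.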
Reference | Related Articles | Metrics
Reduction method of test suites based on mutation analysis
WANG Shuyan, CHEN Pengyuan, SUN Jiaze
Journal of Computer Applications    2017, 37 (12): 3592-3596.   DOI: 10.11772/j.issn.1001-9081.2017.12.3592
In regression testing, the scale of test suites constantly expands and the cost of testing increases as test requirements change. To solve these problems, a Reduction method of Test suites based on Mutation analysis (RTM) was proposed. Firstly, the test cases were classified and a transaction-set matrix of mutants was created in binary form according to whether each designated mutant could be detected by a test case. Then, the correlations between test cases were obtained by using an improved association mining algorithm. Finally, the test suites were effectively reduced according to these relations. Simulation results on six classical programs show that the test suite reduction rate of RTM can reach 37%. Compared with the traditional greedy algorithm and heuristic algorithm, RTM improves the test suite reduction rate by 6% while guaranteeing the test coverage rate; the coverage rate of a single test case even increases by 11% on average. The proposed method can meet more test requirements with fewer test cases, effectively improving test efficiency and reducing test cost.
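For contrast, the classical greedy baseline the abstract compares against can be sketched over a binary mutant "kill" matrix (toy data; this is not the association-mining RTM method itself):

```python
def greedy_reduce(kill_matrix):
    """Greedy test-suite reduction over a mutant 'kill' matrix.

    kill_matrix[t] is the set of mutant ids detected by test t; repeatedly
    pick the test that kills the most still-undetected mutants.
    """
    remaining = set().union(*kill_matrix.values())
    selected = []
    while remaining:
        best = max(kill_matrix, key=lambda t: len(kill_matrix[t] & remaining))
        if not kill_matrix[best] & remaining:
            break                       # no test detects the rest
        selected.append(best)
        remaining -= kill_matrix[best]
    return selected

kills = {"t1": {1, 2, 3}, "t2": {2, 4}, "t3": {4, 5}, "t4": {5}}
print(greedy_reduce(kills))  # ['t1', 't3']
```

Two of four tests already detect all five mutants; RTM aims to find such reductions via mined correlations between tests instead of greedy coverage counting.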
Reference | Related Articles | Metrics
Simultaneous iterative hard thresholding for joint sparse recovery based on redundant dictionaries
CHEN Peng, MENG Chen, WANG Cheng, CHEN Hua
Journal of Computer Applications    2015, 35 (9): 2508-2512.   DOI: 10.11772/j.issn.1001-9081.2015.09.2508
To improve the recovery performance of signals sampled by a sub-Nyquist sampling system with Compressed Sensing (CS), a block Simultaneous Iterative Hard Thresholding (SIHT) recovery algorithm for the joint sparse model based on ε-closure was proposed. Firstly, the CS synthesis model for the Multiple Measurement Vectors (MMV) of the sampling system was analyzed, and the concepts of ε-coherence and the Restricted Isometry Property (RIP) were introduced. Then, according to the block coherence of redundant dictionaries, the SIHT algorithm was improved by optimizing the support sets in the iterations. In addition, the iterative convergence constant was given and the convergence of the algorithm was analyzed. Finally, simulation experiments show that, compared with the traditional method, the new algorithm can achieve a recovery success rate of 100% with enough sampling channels, while the noise suppression ability is improved by 7 dB to 9 dB and the total execution time is reduced by at least 37.9%, with faster convergence.
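A plain (non-block) SIHT iteration for the MMV model can be sketched as follows; the step size, dimensions, and iteration count are illustrative assumptions, and the paper's block-coherence refinements are omitted:

```python
import numpy as np

def siht(Y, A, k, iters=300, step=0.5):
    """Simultaneous Iterative Hard Thresholding for Y = A X with
    jointly row-sparse X: gradient step, then keep the k rows of X
    with the largest l2 norms across all measurement vectors."""
    X = np.zeros((A.shape[1], Y.shape[1]))
    for _ in range(iters):
        X = X + step * A.T @ (Y - A @ X)    # gradient step on ||Y - AX||_F^2
        norms = np.linalg.norm(X, axis=1)   # joint (row-wise) energies
        keep = np.argsort(norms)[-k:]       # k strongest rows survive
        mask = np.zeros(A.shape[1], dtype=bool)
        mask[keep] = True
        X[~mask] = 0.0                      # simultaneous hard threshold
    return X

rng = np.random.default_rng(1)
A = rng.normal(size=(32, 64)) / np.sqrt(32)    # sensing matrix, ~unit-norm columns
X_true = np.zeros((64, 4))
X_true[[3, 17, 40]] = rng.normal(size=(3, 4))  # 3 jointly sparse rows, 4 snapshots
Y = A @ X_true
X_hat = siht(Y, A, k=3)
print(np.flatnonzero(np.linalg.norm(X_hat, axis=1)))
```

The joint row-norm thresholding is what distinguishes SIHT from running ordinary IHT on each measurement vector separately.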
Reference | Related Articles | Metrics
Study of human motion tracking system based on wireless sensor network
CHEN Pengzhan, LI Jie, LUO Man
Journal of Computer Applications    2015, 35 (8): 2316-2320.   DOI: 10.11772/j.issn.1001-9081.2015.08.2316

To address the problems of attitude drift, poor real-time performance, and high cost in motion capture systems based on inertial sensors, a real-time motion capture system with low cost and low power consumption was designed to effectively overcome attitude drift. Firstly, distributed joint motion capture nodes were built based on human body kinematics, with every node working in a low-power mode: when the data acquired by a node fell below a predetermined threshold, the node automatically entered sleep mode to reduce system power consumption. To reduce the data drift of traditional algorithms, an algorithm combining inertial navigation with Kalman filtering was designed to compute the motion data in real time. Using a Wi-Fi module, the TCP/IP protocol was adopted to transmit the attitude data, which could drive the model in real time. Finally, the accuracy of the algorithm was evaluated on a multi-axis motor test platform, and the system's tracking of real human motion was compared. The experimental results show that, compared with the traditional complementary filtering algorithm, the proposed algorithm achieves higher accuracy, keeping the angle drift within one degree, and exhibits no obvious lag, enabling accurate tracking of human motion.
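A single-axis sketch of fusing gyroscope integration with accelerometer measurements via a scalar Kalman filter (the noise parameters and constant-angle test signal are invented for illustration; the paper's system runs such a filter per joint on distributed nodes):

```python
def kalman_angle(gyro_rates, accel_angles, dt=0.01, q=0.001, r=0.03):
    """Fuse gyro rates (rad/s) with accelerometer-derived angles (rad).

    q: process noise (models gyro drift), r: measurement noise.
    Returns the filtered angle estimate at each step.
    """
    angle, p = 0.0, 1.0
    out = []
    for rate, meas in zip(gyro_rates, accel_angles):
        # predict: integrate the gyro rate
        angle += rate * dt
        p += q
        # update: correct toward the accelerometer angle
        k = p / (p + r)
        angle += k * (meas - angle)
        p *= 1.0 - k
        out.append(angle)
    return out

# true angle 1.0 rad; the gyro reports a constant spurious 0.05 rad/s drift
est = kalman_angle([0.05] * 500, [1.0] * 500)
print(est[-1])
```

Pure integration of the biased gyro would drift without bound; the accelerometer correction pins the estimate near the true 1.0 rad.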

Reference | Related Articles | Metrics
Key technologies of dynamic information database for power systems
HUANG Haifeng, ZHANG Keheng, ZHANG Hong, JI Xuechun, CHEN Peng
Journal of Computer Applications    2011, 31 (06): 1681-1684.   DOI: 10.3724/SP.J.1087.2011.01681
On the basis of analyzing the structure of the dynamic information database, and in combination with the features of power systems, the key technologies of concurrent data processing, memory-mapped files, the disk cache management mechanism, and associated data storage were discussed, and the data sampling flow and a hybrid compression algorithm were introduced in detail. An application case in an automatic power grid dispatching system was presented, and the results prove that the dynamic information database can meet the performance requirements of high-speed data processing.
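The memory-mapped-file technique mentioned above can be illustrated with a small Python sketch that stores fixed-width (timestamp, value) samples; the record layout and file name are invented for the example:

```python
import mmap
import os
import struct
import tempfile

# one sample record: int64 timestamp + float64 value, little-endian
REC = struct.Struct("<qd")

path = os.path.join(tempfile.mkdtemp(), "samples.dat")
with open(path, "wb") as f:
    f.write(b"\x00" * REC.size * 1000)   # pre-allocate room for 1000 records

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)
    # writes land in the OS page cache; the kernel flushes them to disk,
    # which is the low-latency property such databases exploit
    REC.pack_into(mm, 5 * REC.size, 1700000000, 49.98)
    ts, val = REC.unpack_from(mm, 5 * REC.size)
    mm.close()
print(ts, val)
```

Record offsets are computed directly from indices, so reads and writes need no parsing or seeking logic beyond simple arithmetic.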
Related Articles | Metrics
Updated algorithm for mining association rules based on parallel computation
WU Lei,CHEN Peng
Journal of Computer Applications    2005, 25 (09): 1989-1991.   DOI: 10.3724/SP.J.1087.2005.01989
Updated solutions for the parallel implementation of association rule discovery were studied, and an IDD algorithm based on the DD algorithm was introduced. Then an HD algorithm, based on the IDD and CD algorithms, was proposed to solve the problem of distributing the candidate itemsets among processors effectively. Finally, the complexity of the IDD and HD algorithms was analyzed.
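A toy sketch of the underlying idea of distributing candidate itemsets across processors so that each processor owns a disjoint slice of the counting work (the concrete IDD/HD distribution policies differ):

```python
def partition_candidates(candidates, n_procs):
    """Hash-partition candidate itemsets across n_procs processors.

    Each itemset is assigned to exactly one bucket, so no processor
    duplicates another's support counting.
    """
    buckets = [[] for _ in range(n_procs)]
    for itemset in candidates:
        owner = hash(frozenset(itemset)) % n_procs  # stable for int items
        buckets[owner].append(itemset)
    return buckets

cands = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (1, 4)]
parts = partition_candidates(cands, n_procs=3)
print(sum(len(b) for b in parts))  # every candidate assigned exactly once
```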
Related Articles | Metrics